Data clustering: a review

A. K. Jain, M. N. Murty, P. J. Flynn 

September 1999 ACM Computing Surveys (CSUR), volume 31 issue 3
Publisher: ACM Press 

Additional Information: full citation, abstract, references, citings, index
terms, review



Full text available: pdf(636.24 KB)



Clustering is the unsupervised classification of patterns (observations, data items, or 
feature vectors) into groups (clusters). The clustering problem has been addressed in 
many contexts and by researchers in many disciplines; this reflects its broad appeal and 
usefulness as one of the steps in exploratory data analysis. However, clustering is a 
difficult problem combinatorially, and differences in assumptions and contexts in different 
communities has made the transfer of useful generic co ... 

Keywords: cluster analysis, clustering applications, exploratory data analysis, 
incremental clustering, similarity indices, unsupervised learning 



2 Articles on microarray data mining: Towards interactive exploration of gene
expression patterns 
Daxin Jiang, Jian Pei, Aidong Zhang 

December 2003 ACM SIGKDD Explorations Newsletter volume 5 issue 2 
Publisher: ACM Press 

Full text available: pdf(527.68 KB) Additional Information: full citation, abstract, references

Analyzing coherent gene expression patterns is an important task in bioinformatics 
research and biomedical applications. Recently, various clustering methods have been 
adapted or proposed to identify clusters of co-expressed genes and recognize coherent 
expression patterns as the centroids of the clusters. However, the interpretation of co- 
expressed genes and coherent patterns mainly depends on the domain knowledge, which 
presents several challenges for coherent pattern mining and cannot be solv ... 

3 Hierarchical scene structure representations to facilitate image understanding 
A. J. Maren, M. Ali

June 1988 Proceedings of the 1st international conference on Industrial and
engineering applications of artificial intelligence and expert systems
Volume 2 IEA/AIE '88

Publisher: ACM Press 
Seeing, hearing, and touching: putting it all together

Brian Fisher, Sidney Fels, Karon MacLean, Tamara Munzner, Ronald Rensink 
August 2004 ACM SIGGRAPH 2004 Course Notes SIGGRAPH '04 

Publisher: ACM Press 

Full text available: pdf(20.64 MB) Additional Information: full citation



Selected writings on computing: a personal perspective 
Edsger W. Dijkstra 
January 1982 Book 

Publisher: Springer-Verlag New York, Inc. 

Additional Information: full citation, abstract, references, cited by, index terms

Since the summer of 1973, when I became a Burroughs Research Fellow, my life has 
been very different from what it had been before. The daily routine changed: instead of 
going to the University each day, where I used to spend most of my time in the company 
of others, I now went there only one day a week and was most of the time --that is, when
not travelling!-- alone in my study. In my solitude, mail and the written word in general
became more and more important. The circumstance that my employe ... 

Multiclass Cancer Classification Using Semisupervised Ellipsoid ARTMAP and 
Particle Swarm Optimization with Gene Expression Data
Rui Xu, Georgios C. Anagnostopoulos, Donald C. Wunsch 

January 2007 IEEE/ACM Transactions on Computational Biology and Bioinformatics
(TCBB), Volume 4 Issue 1
Publisher: IEEE Computer Society Press 

Full text available: pdf(3.70 MB) Additional Information: full citation, abstract, references, index terms

It is crucial for cancer diagnosis and treatment to accurately identify the site of origin of a 
tumor. With the emergence and rapid advancement of DNA microarray technologies, 
constructing gene expression profiles for different cancer types has already become a 
promising means for cancer classification. In addition to research on binary classification 
such as normal versus tumor samples, which attracts numerous efforts from a variety of 
disciplines, the discrimination of multiple tumor types is ... 

Keywords: Cancer classification, gene expression profile, semisupervised ellipsoid 
ARTMAP, particle swarm optimization. 



Ecological interface enabling human-embodied cognition in mobile robot
teleoperation 

Tetsuo Sawaragi, Yukio Horiguchi 
September 2000 intelligence, volume 11 issue 3

Publisher: ACM Press 

Full text available: pdf, html(38.16 KB) Additional Information: full citation, references, index terms



Book review: Competitively Inhibited Neural Networks for Adaptive Parameter
Estimation by Michael Lemmon (Kluwer Academic Publishers, 1991)
Joseph M. Barone
October 1992 ACM SIGART Bulletin, volume 3 issue 4

Publisher: ACM Press 

Full text available: pdf(566.62 KB) Additional Information: full citation, abstract, references

Rigorous, formal treatments of neural network fundamentals (i.e., treatments whose 
arguments consist primarily of theorems and proofs) have by now focused on a number of 
aspects. The convergence properties (stability) and capacity of neural nets of various 
types have been analyzed in this manner to one degree or another (e.g., [1-3]), and their 
expressive power has also been the subject of a number of formal analyses (e.g., [4]).
Though not necessarily perfectly rigorous in the sense just mention ... 

9 A template-based and pattern-driven approach to situation awareness and
assessment in virtual humans
Weixiong Zhang, Randal W. Hill 

June 2000 Proceedings of the fourth international conference on Autonomous agents 
AGENTS '00

Publisher: ACM Press 

Full text available: pdf(1.20 MB) Additional Information: full citation, references, index terms




Keywords: organizational and spatial relationship, pattern matching, perception, 
situation awareness and assessment, templates 



10 OPTICS: ordering points to identify the clustering structure
Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel, Jorg Sander

June 1999 ACM SIGMOD Record, Proceedings of the 1999 ACM SIGMOD international
conference on Management of data SIGMOD '99, volume 28 issue 2 
Publisher: ACM Press 

Full text available: pdf(1.77 MB) Additional Information: full citation, abstract, references, citings, index
terms

Cluster analysis is a primary method for database mining. It is either used as a stand- 
alone tool to get insight into the distribution of a data set, e.g. to focus further analysis 
and data processing, or as a preprocessing step for other algorithms operating on the 
detected clusters. Almost all of the well-known clustering algorithms require input 
parameters which are hard to determine but have a significant influence on the clustering 
result. Furthermore, for many real-data sets there doe ... 

Keywords: cluster analysis, database mining, visualization 



11 iVIBRATE: Interactive visualization-based framework for clustering large datasets
Keke Chen, Ling Liu 

April 2006 ACM Transactions on Information Systems (TOIS), volume 24 issue 2 
Publisher: ACM Press 

Full text available: pdf(4.48 MB) Additional Information: full citation, abstract, references, index terms

With continued advances in communication network technology and sensing technology, 
there is astounding growth in the amount of data produced and made available through 
cyberspace. Efficient and high-quality clustering of large datasets continues to be one of 
the most important problems in large-scale data analysis. A commonly used methodology 
for cluster analysis on large datasets is the three-phase framework of 
sampling/summarization, iterative cluster analysis, and disk-labeling. There are th ... 
Keywords: Clustering, interactive visualization, labeling, large datasets, performance



12 Bayesian Network Learning with Parameter Constraints
Radu Stefan Niculescu, Tom M. Mitchell, R. Bharat Rao 
December 2006 The Journal of Machine Learning Research, volume 7

Publisher: MIT Press 

Full text available: pdf(300.61 KB) Additional Information: full citation, abstract

The task of learning models for many real-world problems requires incorporating domain 
knowledge into learning algorithms, to enable accurate learning from a realistic volume of 
training data. This paper considers a variety of types of domain knowledge for 
constraining parameter estimates when learning Bayesian networks. In particular, we 
consider domain knowledge that constrains the values or relationships among subsets of 
parameters in a Bayesian network with known structure. 



We inco ... 



13 Exploiting inheritance and structure semantics for effective clustering and buffering in
an object-oriented DBMS
E. E. Chang, R. H. Katz

June 1989 ACM SIGMOD Record, Proceedings of the 1989 ACM SIGMOD international
conference on Management of data SIGMOD '89, volume 18 issue 2
Publisher: ACM Press 



Full text available: pdf(1.21 MB) Additional Information: full citation, abstract, references, citings, index
terms



Object-oriented databases provide new kinds of data semantics in terms of inheritance 
and structural relationships. This paper examines how to use these additional semantics 
to obtain more effective object buffering and clustering. We use the information collected 
from real-world object-oriented applications, the Berkeley CAD Group's OCT design tools, 
as the basis for a simulation model with which to investigate alternative buffering and 
clustering strategies. Observing from our measurement ... 

14 Special issue on ICML: Coupled clustering: a method for detecting structural
correspondence

Zvika Marx, Ido Dagan, Joachim M. Buhmann, Eli Shamir 

March 2003 The Journal of Machine Learning Research, volume 3 

Publisher: MIT Press 

Full text available: pdf(967.15 KB) Additional Information: full citation, abstract, citings, index terms

This paper proposes a new paradigm and a computational framework for revealing 
equivalencies (analogies) between sub-structures of distinct composite systems that are 
initially represented by unstructured data sets. For this purpose, we introduce and 
investigate a variant of traditional data clustering, termed coupled clustering, which 
outputs a configuration of corresponding subsets of two such representative sets. We 
apply our method to synthetic as well as textual data. Its achievement ... 



15 Clustering: Efficiently clustering transactional data with weighted coverage density
Hua Yan, Keke Chen, Ling Liu 

November 2006 Proceedings of the 15th ACM international conference on Information 
and knowledge management CIKM '06 

Publisher: ACM Press 

Full text available: pdf(367.51 KB) Additional Information: full citation, abstract, references, index terms

It is widely recognized that developing efficient and fully automated algorithms for
clustering large transactional datasets is a challenging problem. In this paper, we propose
a fast, memory-efficient, and scalable clustering algorithm for analyzing transactional 
data. Our approach has three unique features. First, we use the concept of Weighted 
Coverage Density as a categorical similarity measure for efficient clustering of 
transactional datasets. The concept of weighted coverage density is in ... 

Keywords: AMI, LISR, SCALE, weighted coverage density 



16 Special issue on the fusion of domain knowledge with data for decision support:
Fusion of domain knowledge with data for structural learning in object oriented
domains

Helge Langseth, Thomas D. Nielsen 

December 2003 The Journal of Machine Learning Research, volume 4 
Publisher: MIT Press 

Full text available: pdf(227.18 KB) Additional Information: full citation, abstract, references, index terms,
review

When constructing a Bayesian network, it can be advantageous to employ structural 
learning algorithms to combine knowledge captured in databases with prior information 
provided by domain experts. Unfortunately, conventional learning algorithms do not easily 
incorporate prior information, if this information is too vague to be encoded as properties 
that are local to families of variables. For instance, conventional algorithms do not exploit 
prior information about repetitive structures, which are ... 

17 External and internal representations appropriate for ART neural networks
M. Cader, D. Benachenhou, L. Medsker, H. Szu

September 1990 Proceedings of the 1990 ACM SIGBDP conference on Trends and
directions in expert systems SIGBDP '90 
Publisher: ACM Press 

Full text available: pdf(691.54 KB) Additional Information: full citation, references, index terms



18 Fast detection of communication patterns in distributed executions
Thomas Kunz, Michiel F. H. Seuren 

November 1997 Proceedings of the 1997 conference of the Centre for Advanced 
Studies on Collaborative research CASCON '97 

Publisher: IBM Press 

Full text available: pdf(4.21 MB) Additional Information: full citation, abstract, references, index terms

Understanding distributed applications is a tedious and difficult task. Visualizations based 
on process-time diagrams are often used to obtain a better understanding of the 
execution of the application. The visualization tool we use is Poet, an event tracer 
developed at the University of Waterloo. However, these diagrams are often very complex 
and do not provide the user with the desired overview of the application. In our 
experience, such tools display repeated occurrences of non-trivial commun ... 

19 Paper session KM-3 (knowledge management): classification & clustering: Versatile
structural disambiguation for semantic-aware applications
Federica Mandreoli, Riccardo Martoglia, Enrico Ronchetti 

October 2005 Proceedings of the 14th ACM international conference on Information 
and knowledge management CIKM '05 

Publisher: ACM Press 

Full text available: pdf(216.35 KB) Additional Information: full citation, abstract, references, index terms

In this paper, we propose a versatile disambiguation approach which can be used to make
explicit the meaning of structure based information such as XML schemas, XML document
structures, web directories, and ontologies. It can be of support to the semantic- 
awareness of a wide range of applications, from schema matching and query rewriting to 
peer data management systems, from XML data clustering to ontology-based automatic 
annotation of web pages and query expansion. The effectiveness of the achi ... 

Keywords: semantic web, structure based information, word sense disambiguation 



20 Data cleaning and integration: Leveraging data and structure in ontology integration
Octavian Udrea, Lise Getoor, Renee J. Miller

June 2007 Proceedings of the 2007 ACM SIGMOD international conference on 
Management of data SIGMOD '07 

Publisher: ACM Press 

Full text available: pdf(462.13 KB) Additional Information: full citation, abstract, references, index terms

There is a great deal of research on ontology integration which makes use of rich logical 
constraints to reason about the structural and logical alignment of ontologies. There is 
also considerable work on matching data instances from heterogeneous schema or 
ontologies. However, little work exploits the fact that ontologies include both data and 
structure. We aim to close this gap by presenting a new algorithm (ILIADS) that tightly 
integrates both data matching and logical reasoning to achieve ... 

Keywords: data integration, logical inference, ontology alignment, schema mapping, 
statistical inference 
June 2006 Computational Linguistics, volume 32 issue 2
Publisher: MIT Press 

Full text available: pdf(3.60 MB) Additional Information: full citation, abstract, references, cited by

The initial knowledge base is later enriched with information from other machine-readable 
dictionaries. Information about the collocational behavior of the near-synonyms is 
acquired from free text. The knowledge base is used by Xenon, a natural language 
generation system that shows how the new lexical resource can be used to choose the 
best near-synonym in specific situations. 



2 Web 1 -- exploiting graph structure: Respect my authority!: HITS without hyperlinks,
utilizing cluster-based language models 
Oren Kurland, Lillian Lee 

August 2006 Proceedings of the 29th annual international ACM SIGIR conference on 
Research and development in information retrieval SIGIR '06 

Publisher: ACM Press 

Full text available: pdf(210.69 KB) Additional Information: full citation, abstract, references, index terms

We present an approach to improving the precision of an initial document ranking wherein 
we utilize cluster information within a graph-based framework. The main idea is to 
perform reranking based on centrality within bipartite graphs of documents (on one side) 
and clusters (on the other side), on the premise that these are mutually reinforcing 
entities. Links between entities are created via consideration of language models induced 
from them. We find that our cluster-document graphs give r ... 

Keywords: HITS, authorities, bipartite graph, cluster-based language models, clusters, 
graph-based retrieval, high-accuracy retrieval, hubs, language modeling, structural re- 
ranking 




3 Multiclass Cancer Classification Using Semisupervised Ellipsoid ARTMAP and
Particle Swarm Optimization with Gene Expression Data
Rui Xu, Georgios C. Anagnostopoulos, Donald C. Wunsch 

January 2007 IEEE/ACM Transactions on Computational Biology and Bioinformatics
(TCBB), Volume 4 Issue 1
Publisher: IEEE Computer Society Press

Full text available: pdf(3.70 MB) Additional Information: full citation, abstract, references, index terms

It is crucial for cancer diagnosis and treatment to accurately identify the site of origin of a 
tumor. With the emergence and rapid advancement of DNA microarray technologies, 
constructing gene expression profiles for different cancer types has already become a 
promising means for cancer classification. In addition to research on binary classification 
such as normal versus tumor samples, which attracts numerous efforts from a variety of 
disciplines, the discrimination of multiple tumor types is ... 

Keywords: Cancer classification, gene expression profile, semisupervised ellipsoid 
ARTMAP, particle swarm optimization. 



4 Collective entity resolution in relational data
Indrajit Bhattacharya, Lise Getoor

March 2007 ACM Transactions on Knowledge Discovery from Data (TKDD), volume 1
Issue 1
Publisher: ACM Press 

Full text available: pdf(5 1l57 KB) Additional Information: full citation, abstract, references, index terms

Many databases contain uncertain and imprecise references to real-world entities. The 
absence of identifiers for the underlying entities often results in a database which contains 
multiple references to the same entity. This can lead not only to data redundancy, but 
also inaccuracies in query processing and knowledge extraction. These problems can be 
alleviated through the use of entity resolution. Entity resolution involves discovering the 
underlying entities and mapping each database ... 

Keywords: Entity resolution, data cleaning, graph clustering, record linkage 



Automatic verb classification based on statistical distributions of argument structure
Paola Merlo, Suzanne Stevenson 

September 2001 Computational Linguistics, volume 27 issue 3 
Publisher: MIT Press 

Full text available: pdf(341.42 KB), Publisher Site
Additional Information: full citation, abstract, references, citings

Automatic acquisition of lexical knowledge is critical to a wide range of natural language 
processing tasks. Especially important is knowledge about verbs, which are the primary 
source of relational information in a sentence—the predicate-argument structure that 
relates an action or state to its participants (i.e., who did what to whom). In this work, we 
report on supervised learning experiments to automatically classify three major types of 
English verbs, based on their argument structure—sp ... 

IR-KM-1 (information retrieval and knowledge management): text mining: Event
threading within news topics

Ramesh Nallapati, Ao Feng, Fuchun Peng, James Allan 

November 2004 Proceedings of the thirteenth ACM international conference on 
Information and knowledge management CIKM '04 

Publisher: ACM Press 

Full text available: pdf(123.27 KB) Additional Information: full citation, abstract, references, citings, index
terms

With the overwhelming volume of online news available today, there is an increasing need 
for automatic techniques to analyze and present news to the user in a meaningful and 
efficient manner. Previous research focused only on organizing news stories by their 
topics into a flat hierarchy. We believe viewing a news topic as a flat collection of stories 
is too restrictive and inefficient for a user to understand the topic quickly.

In this work, we attempt to capture the rich structure of ...

Keywords: clustering, dependency, event, threading



Knowledge management session 4: indexing: Bootstrapping for hierarchical
document classification

Giordano Adami, Paolo Avesani, Diego Sona 

November 2003 Proceedings of the twelfth international conference on Information 
and knowledge management CIKM '03 

Publisher: ACM Press 

Full text available: pdf(180.73 KB) Additional Information: full citation, abstract, references, citings, index
terms

Managing the hierarchical organization of data is starting to play a key role in the 
knowledge management community due to the great amount of human resources needed 
to create and maintain these organized repositories of information. Machine learning 
community has in part addressed this problem by developing hierarchical supervised 
classifiers that help maintainers to categorize new resources within given hierarchies. 
Although such learning models succeed in exploiting relational knowledge, they ... 

Keywords: TaxSOM, constrained clustering, k-means, taxonomy bootstrapping process, 
text categorization 



8 Legal knowledge bases 3: document retrieval: Effective document clustering for large
heterogeneous law firm collections
Jack G. Conrad, Khalid Al-Kofahi, Ying Zhao, George Karypis

June 2005 Proceedings of the 10th international conference on Artificial intelligence 
and law ICAIL '05 

Publisher: ACM Press 

Full text available: pdf(517.35 KB) Additional Information: full citation, abstract, references, index terms

Computational resources for research in legal environments have historically implied 
remote access to large databases of legal documents such as case law, statutes, law 
reviews and administrative materials. Today, by contrast, there exists enormous growth 
in lawyers' electronic work product within these environments, specifically within law 
firms. Along with this growth has come the need for accelerated knowledge management- 
--automated assistance in organizing, analyzing, retrieving and presenti ... 

Keywords: document clustering, knowledge management, legal data, taxonomy
development 




Using multiple knowledge sources for word sense discrimination
Susan W. McRoy 

March 1992 Computational Linguistics, volume 18 issue 1
Publisher: MIT Press 

Full text available: pdf(2.02 MB), Publisher Site
Additional Information: full citation, abstract, references, citings

This paper addresses the problem of how to identify the intended meaning of individual 
words in unrestricted texts, without necessarily having access to complete representations 
of sentences. To discriminate senses, an understander can consider a diversity of 
information, including syntactic tags, word frequencies, collocations, semantic context,
role-related expectations, and syntactic restrictions. However, current approaches make 
use of only small subsets of this information. Here we will des ... 

10 Answering Clinical Questions with Knowledge-Based and Statistical Techniques
Dina Demner-Fushman, Jimmy Lin 

March 2007 Computational Linguistics, volume 33 issue 1 

Publisher: MIT Press 

Full text available: pdf(295.45 KB) Additional Information: full citation, abstract, index terms

The combination of recent developments in question-answering research and the 
availability of unparalleled resources developed specifically for automatic semantic 
processing of text in the medical domain provides a unique opportunity to explore 
complex question answering in the domain of clinical medicine. This article presents a 
system designed to satisfy the information needs of physicians practicing evidence-based 
medicine. We have developed a series of knowledge extractors, which employ a ... 

11 Data clustering: a review
A. K. Jain, M. N. Murty, P. J. Flynn

September 1999 ACM Computing Surveys (CSUR), volume 31 issue 3 
Publisher: ACM Press 

Full text available: pdf(636.24 KB) Additional Information: full citation, abstract, references, citings, index
terms, review

Clustering is the unsupervised classification of patterns (observations, data items, or 
feature vectors) into groups (clusters). The clustering problem has been addressed in 
many contexts and by researchers in many disciplines; this reflects its broad appeal and 
usefulness as one of the steps in exploratory data analysis. However, clustering is a 
difficult problem combinatorially, and differences in assumptions and contexts in different 
communities has made the transfer of useful generic co ... 

Keywords: cluster analysis, clustering applications, exploratory data analysis, 
incremental clustering, similarity indices, unsupervised learning 



12 Long papers: knowledge acquisition and knowledge-based design: Suggesting novel
but related topics: towards context-based support for knowledge model extension
Ana Maguitman, David Leake, Thomas Reichherzer

January 2005 Proceedings of the 10th international conference on Intelligent user
interfaces IUI '05
Publisher: ACM Press 

Full text available: pdf(1.11 MB) Additional Information: full citation, abstract, references, index terms,
review

Much intelligent user interfaces research addresses the problem of providing information 
relevant to a current user topic. However, little work addresses the complementary 
question of helping the user identify potential topics to explore next. In knowledge 
acquisition, this question is crucial to deciding how to extend previously-captured 
knowledge. This paper examines requirements for effective topic suggestion and presents 
a domain-independent topic-generation algorithm designed to generate ca ... 

Keywords: automatic topic search, concept mapping, context, human-centered 
knowledge acquisition tools 
Jana Kosecka, Yi Ma, Stefano Soatto, Rene Vidal
August 2004 ACM SIGGRAPH 2004 Course Notes SIGGRAPH '04

Publisher: ACM Press 

Full text available: pdf(23.14 MB) Additional Information: full citation, abstract

This course presents the state of the art in multiple-view geometry, including methods 
and algorithms for reconstructing 3-D geometric models of scenes from video or 
photographs. This course is based on a novel approach to multiple-view geometry that 
only requires linear algebra, as opposed to more involved projective and algebraic 
geometry that most current methods employ. This new approach aims to make image- 
based modeling techniques accessible to a larger audience compared to existing ones. 
T ... 



14 Terminology-based knowledge mining for new knowledge discovery
Hideki Mima, Sophia Ananiadou, Katsumori Matsushima 

March 2006 ACM Transactions on Asian Language Information Processing (TALIP),
Volume 5 Issue 1
Publisher: ACM Press 

Full text available: pdf(1.36 MB) Additional Information: full citation, abstract, references, index terms

In this article we present an integrated knowledge-mining system for the domain of 
biomedicine, in which automatic term recognition, term clustering, information retrieval, 
and visualization are combined. The primary objective of this system is to facilitate 
knowledge acquisition from documents and aid knowledge discovery through terminology- 
based similarity calculation and visualization of automatically structured knowledge. This 
system also supports the integration of different types of databa ... 
Most previous work on the recently developed language-modeling approach to information
retrieval focuses on document-specific characteristics, and therefore does not take into 
account the structure of the surrounding corpus. We propose a novel algorithmic 
framework in which information provided by document-based language models is 
enhanced by the incorporation of information drawn from clusters of similar documents. 
Using this framework, we develop a suite of new algorithms. Even t ... 
This paper introduces the problem of combining multiple partitionings of a set of objects
into a single consolidated clustering without accessing the features or algorithms that 
determined these partitionings. We first identify several application scenarios for the 
resultant 'knowledge reuse' framework that we call cluster ensembles. The cluster 
ensemble problem is then formalized as a combinatorial optimization problem in terms of 
shared mutual information. In addition to a direct ... 
A hierarchical structure can provide efficient access to information contained in a
collection of documents. However, such a structure is not always available, e.g. for a set 
of documents a user has collected over time in a single folder or the results of a web 
search. We therefore investigate in this paper how we can obtain a hierarchical structure 
automatically, taking into account some background knowledge about the way a specific 
user would structure the collection. More specifically, we ada ... 
We address the problem of clustering multimodal group actions in meetings using a two-
layer HMM framework. Meetings are structured as sequences of group actions. Our 
approach aims at creating one cluster for each group action, where the number of group 
actions and the action boundaries are unknown a priori. In our framework, the first layer 
models typical actions of individuals in meetings using supervised HMM learning and low-
level audio-visual features. A number of options that explicitly m ... 

Keywords: automatic meeting analysis, multi-person event modeling, multi-sensor 
networks 